Abstract 

Background: The developing mouse kidney is currently the best-characterized model of organogenesis at a 
transcriptional level. Detailed spatial maps have been generated using gene expression profiling combined with 
systematic in situ screening. These studies, however, fall short of capturing the transcriptional complexity arising 
from each locus due to the limited scope of microarray-based technology, which is largely based on "gene-centric" 
models. 

Results: To address this, the polyadenylated RNA and microRNA transcriptomes of the 15.5 dpc mouse kidney 
were profiled using strand-specific RNA-sequencing (RNA-Seq) to a depth sufficient to complement spatial maps 
from pre-existing microarray datasets. The transcriptional complexity of RNAs arising from mouse RefSeq loci was 
catalogued, including 3568 alternatively spliced transcripts and 532 uncharacterized alternate 3' UTRs. Antisense 
expression for 60% of RefSeq genes was also detected, including uncharacterized non-coding transcripts 
overlapping the kidney progenitor markers Six2 and Sall1, which were validated by section in situ hybridization. Analysis 
of genes known to be involved in kidney development, particularly during mesenchymal-to-epithelial transition, 
showed an enrichment of non-coding antisense transcripts extended along protein-coding RNAs. 

Conclusion: The resulting resource further refines the transcriptomic cartography of kidney organogenesis by 
integrating deep RNA sequencing data with locus-based information from previously published expression atlases. 
The added resolution of RNA-Seq has provided the basis for a transition from classical gene-centric models of 
kidney development towards more accurate and detailed "transcript-centric" representations, which highlights the 
extent of transcriptional complexity of genes that direct complex development events. 

Keywords: RNA-Seq, kidney development, microarray, Six2, Wt1, sense-antisense transcripts, alternative splicing, 
mesenchymal-epithelial transition, miR-214, microRNA 



Background 

The mammalian kidney is a remarkably complex organ 
at the cellular and functional level, being essential not 
merely for excretory functions but also for a variety of 
hormonal and homeostatic regulatory functions. A key 
structure is the nephron, which represents the functional 
excretory unit of the kidney. During kidney 
development, the nephron arises via a reciprocal 
interaction between a mesenchymal progenitor population 
and an adjacent epithelial ureteric tip, where the 
latter induces the former to undergo a mesenchymal-to- 
epithelial transition (MET), signaling the start of 
nephrogenesis (reviewed in [1,2]). Although well studied, 
the complete transcriptional regulatory networks are just 
beginning to be elucidated. 

© 2011 Thiagarajan et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative 
Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and 
reproduction in any medium, provided the original work is properly cited. 

Thiagarajan et al. BMC Genomics 2011, 12:441 
http://www.biomedcentral.com/1471-2164/12/441 

Transcriptional profiling of the developing kidney 
using microarrays coupled with RNA in situ hybridizations 
(ISH) has provided a detailed view of gene 
expression networks driving developmental processes 
[3-6]. Despite these advances, microarrays cannot capture 
the entire transcriptional output from mammalian 
genes (reviewed in [7,8]) as they require a priori 
assumptions about the portion of the genome that is 
expressed, limiting the ability to use this technology for 
uncharacterized gene or transcript discovery [8]. This 
also applies to mRNA variants. On average, 6-7 different 
mRNA variants can arise from a single active locus [9], 
and this complexity includes alternate promoters, alter- 
nate 3' untranslated regions (UTRs), alternative exons, 
and alternative splice sites. The vast majority of this 
complexity is invisible to microarray probes, which are 
typically short (25-70 nt) and located in the 3' UTR of 
transcripts [10]. Such limitations mean that kidney 
developmental programs have only been explored at 
"gene-centric" resolution. Given the consequences of 
transcriptional complexity (alternate domain content, 
differential transcription factor binding sites and micro- 
RNA binding sites from alternative promoter and 3'UTR 
usage, respectively), understanding the complete reper- 
toire of transcripts is crucial for accurate modelling of 
kidney organogenesis. 

Massive-scale sequencing of transcriptomes (RNA- 
Seq) overcomes most of the limitations imposed by 
microarrays, and additionally offers high dynamic range, 
increased accuracy, and increased specificity [11-13], 
although not yet capable of single cell resolution. Appli- 
cation of this technology has enabled the identification 
of uncharacterized transcripts, genes, and non-coding 
RNAs (ncRNAs) [11,12,14,15], and in all studies, the 
level of complexity has been far higher than previously 
predicted. Although these features make it highly desir- 
able, RNA-Seq is not practical for all experiments, due 
primarily to laborious protocols and the need for large 
quantities of starting material. The recent application of 
single-cell RNA-Seq has allowed profiling of samples 
available only in limited quantities, such as those from embryonic 
development, but this technique did not discriminate 
strand-specific transcripts and did not detect 5' ends of 
transcripts longer than 3 kb, which would hinder analysis 
of alternative promoter usage [16,17]. For the analysis of 
complex processes such as organogenesis where indivi- 
dual cellular components are difficult to separate, RNA- 
Seq to this level of resolution is not practical whereas 
gene expression profiling on whole organs may fail to 
detect subcompartment-specific transcripts. The integration 
of both types of analyses, however, may overcome 
the limitations of each, without the need to completely 
replace the current wealth of high-quality microarray 
datasets. 

In this study, we describe a high quality, stranded, 
polyadenylated RNA-Seq and microRNA (miRNA)-Seq 
profiling resource of the whole embryonic mouse kidney 
for the purpose of integration with a previously defined 
spatial-resolution kidney microarray atlas. In comparison to 
the microarray kidney atlas [5], we show that high cov- 
erage whole organ RNA-Seq is sensitive enough to both 
detect compartment-specific transcripts, and quantify 
transcript abundance relative to the whole organ. We 
have used this technique to assess the transcriptional 
complexity within the developing kidney subcompart- 
ments, identifying mRNA variants of many key kidney 
developmental genes. We also detect widespread sense-antisense 
transcription among important MET regulators, which we 
validated by section in situ hybridization (SISH). Together, the datasets 
generated in this study advance gene-centric models of 
kidney development pathways towards more complete 
transcript-centric models, capturing the transcriptional 
landscape of gene expression. 

Results 

Deep sequencing of the 15.5 dpc mouse kidney 

The 15.5 dpc embryonic mouse kidney contains subcompartments 
representing all stages of progression during 
renal development [5]. The total ribosomal-RNA 
depleted transcriptome (including miRNAs) of the 15.5 
dpc mouse kidney was surveyed using massive-scale 
stranded sequencing on the SOLiD platform. Approximately 
136 million high-quality, single mapping reads 
were mapped to the reference mouse genome (mm9) 
for the RNA-Seq library, and 788,931 uniquely mapping 
tags to known pre-miRNA hairpins (miRBase version 15 
[18]) (Table 1). Datasets are accessible from the NCBI Short 
Read Archive (SRA026710). 

Table 1 RNA-MATE and Galaxy tag mapping distribution 

Total tags                                          329,923,262
Total tags mapping to genome (mm9)                  136,122,785 (41.3%)
Total unique tags                                   107,339,260 (32.5%)
Number of RefSeq genes (> 1 RPKM)                   12,083
Number of transcripts (> 1 RPKM)                    15,527
Unique tags matching RefSeq NM exons                66,591,988 (62%)
Unique tags matching consensus gene exon models     82,841,356 (77.1%)
  (RefSeq, AceView, Ensembl, UCSC genes)
Total unique junction tags                          7,769,426
Total unique miRNA tags                             788,931

Quantifying embryonic kidney locus activity 

Sequenced Reads Per Kilobase per Million (RPKM) 
values [12] for RefSeq exon models were calculated and 
compared to a high-resolution kidney subcompartment 
microarray gene expression atlas [5]. 12,083 active protein-coding 
loci (RefSeq "NM" IDs only) above 1 
RPKM were identified (Additional file 1). This compares 
to ~5,300 microarray probesets representing 4,248 
RefSeq protein coding loci previously identified based 
on robust gene expression levels using the kidney sub- 
compartment atlas [5]. The majority of active loci were 
expressed at moderate to high levels (10-50 RPKM) 
(Figure 1A) with many key kidney developmental genes 
detected within this range. For example, Six2, a marker 
of the nephron progenitor population [19,20], was 
detected at 25 RPKM, and Wnt4, a marker of the renal 
vesicle, at 16 RPKM. Low-level expressing transcripts 
such as Shh can also be detected in our experiment, at 
1 RPKM, which approximates to roughly 1-2 transcripts 
per cell [12,21]. The RPKM standardization based on 
RNA-Seq tag count offers a sensitive and precise mea- 
sure of transcript abundance relative to the whole 
organ. 
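
The RPKM measure used above can be sketched in a few lines of Python. This is a minimal illustration of the standard definition from [12], not the pipeline used in this study: the function names and the 2 kb example transcript are hypothetical, and taking the unique tag total from Table 1 as the library size is an assumption.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def reads_for_rpkm(target_rpkm, exon_length_bp, total_mapped_reads):
    """Inverse of rpkm(): read count expected at a given RPKM value."""
    return target_rpkm * exon_length_bp * total_mapped_reads / 1e9


# Assumption: the unique tag total from Table 1 serves as the library size.
TOTAL_UNIQUE_TAGS = 107_339_260

# Hypothetical 2 kb transcript: reads needed to reach the 1 RPKM cut-off.
reads_at_cutoff = reads_for_rpkm(1.0, 2_000, TOTAL_UNIQUE_TAGS)
```

Under these assumptions, a 2 kb transcript needs roughly 215 uniquely mapped reads to reach 1 RPKM, which gives a sense of why very deep sequencing is needed to quantify low-abundance loci.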

Detecting rare, tissue-specific transcripts 

A major concern of whole organ profiling is the inability 
to detect rare, cell-type specific transcripts due to the 
heterogeneity of tissue composition [22]. In the pre- 
viously described microarray kidney atlas, this was 
addressed by profiling individual kidney subcompart- 
ments [5]. Subcompartment specific transcripts from 
that kidney microarray atlas were used to determine the 
sensitivity of tissue-specific transcript detection in whole 
organ RNA-Seq. As many as 99.7% of all transcripts 
attributed to major kidney subcompartments were 
detected; the remaining discordant probe sets 
were prone to cross-hybridization, as noted by probe-set 
ID suffixes (_s_at, _x_at, and _a_at [23]), or generally 
had low raw signal (below 100 Raw Fluorescent Units) 
and therefore may be affected by background signal. 

In addition, subcompartment-specific transcripts pro- 
vided the framework to estimate the overall distribution 
of expression within kidney subcompartments. As 
shown in Figure 1B, all major kidney subcompartments 
were represented, where the mean expression abundance 
for each compartment was between 1 and 10 RPKM. 
Rare (0.5 RPKM), subcompartment-specific transcripts 
detected by the kidney microarray atlas were also identified 
by RNA-Seq. This confirms that with sufficient 
sequencing depth, whole organ RNA-Seq can be used to 
detect gene expression that is representative of specific 
kidney cellular populations. 
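
The sensitivity check described in this section can be illustrated with a short Python sketch. It assumes the atlas markers are available as lists of transcript identifiers per subcompartment and the RNA-Seq results as a mapping from identifier to RPKM. All identifiers and values below are hypothetical, and the detection criterion (any RPKM above a threshold) is an assumption rather than the exact rule applied in this study.

```python
from statistics import median


def detection_rate(marker_ids, rpkm_by_transcript, min_rpkm=0.0):
    """Fraction of marker transcripts with RPKM above min_rpkm."""
    if not marker_ids:
        raise ValueError("no marker transcripts supplied")
    detected = [t for t in marker_ids
                if rpkm_by_transcript.get(t, 0.0) > min_rpkm]
    return len(detected) / len(marker_ids)


def median_rpkm_by_compartment(markers_by_compartment, rpkm_by_transcript):
    """Median RPKM of the marker transcripts of each subcompartment."""
    return {
        compartment: median(rpkm_by_transcript.get(t, 0.0) for t in ids)
        for compartment, ids in markers_by_compartment.items()
    }


# Hypothetical marker lists and RPKM values, for illustration only.
markers = {
    "cap mesenchyme": ["t1", "t2", "t3"],
    "renal vesicle": ["t4", "t5"],
}
rpkm_values = {"t1": 25.0, "t2": 0.5, "t3": 3.0, "t4": 16.0, "t5": 0.0}

all_markers = [t for ids in markers.values() for t in ids]
rate = detection_rate(all_markers, rpkm_values)
medians = median_rpkm_by_compartment(markers, rpkm_values)
```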

Figure 1 Embryonic kidney RNA-Seq coverage, depth and 
sensitivity. A: Tag distribution across active genes with varying 
levels of expression in the 15.5 dpc mouse kidney. Genes are 
grouped into reads per kilobase per million (RPKM) (y-axis) bins 
according to expression abundance based on tag coverage (x-axis). 
Low abundance genes are considered to have RPKM values 
between 1-10 RPKM, moderate expression at 10-100 RPKM, and 
highly expressed at above 100 RPKM. B: Box-plot representation of 
embryonic kidney subcompartments captured by whole-kidney 
RNA-Seq profiling. Transcripts with the most subcompartment-specific 
expression from each structure identified from the 
embryonic kidney subcompartment microarray atlas (Brunskill et al. 
[5]) were represented by RPKM values (log10) as detected by RNA-Seq 
to gauge sensitivity of detecting specific embryonic kidney cell types. 
Each box represents kidney subcompartment-specific 
transcripts with corresponding RPKM values. The boxes extend from 
the 25th percentile (lower hinge) to the 75th percentile (upper 
hinge) of RPKM values. The line across the box represents the 
median. The lengths of the lines above and below the box are 
defined by the maximum and minimum RPKM values (respectively). 
Subcompartments: CI: cortical interstitium; Cap: cap mesenchyme; 
MI: medullary interstitium; Utip: ureteric tip; CCD: cortical collecting 
duct; MCD: medullary collecting duct; RV: renal vesicle; SSB: s-shaped 
body; RC: renal corpuscle; PT: proximal tubule; LOH: loop of 
Henle. 

Integration of RNA-Seq with spatially-resolved Affymetrix 
microarrays 

After demonstrating that the RNA-Seq data was highly 
sensitive, we then wanted to integrate it with the spatial-resolution 
embryonic kidney microarray atlas and 
interrogate the transcriptional complexity driving mouse 
kidney organogenesis. Affymetrix Mouse 430.2 probe 
sets were aligned against the mouse genome (mm9) to 
define the boundaries of captured expression. The probe 
set genomic coordinates were then used to overlay sub- 
compartment specific expression as a heatmap-based 
UCSC data track (Figure 2A, B). This revealed the presence 
of probe sets that can be used to capture expression 
beyond annotated gene boundaries, which provides 
excellent spatial resolution for events such as extended 
3'UTR expression (Figure 2), while non-coding RNA 
transcripts can also be captured by multiple or pre- 
viously unassigned probe sets (Additional file 2). Con- 
current use of the UCSC Genome Browser heatmap 
tracks with RNA-Seq tracks therefore provides spatial 
identification for any transcriptional complexity which 
overlaps the microarray probes. 
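
The idea of probe sets capturing expression beyond annotated gene boundaries can be made concrete with a small strand-aware interval check. The Python sketch below is illustrative only: the function name, the category labels, the 20 kb downstream limit for probe sets, and the example coordinates are all assumptions and do not describe the alignment procedure used here.

```python
def classify_probe(probe_start, probe_end, gene_start, gene_end, strand,
                   max_downstream=20_000):
    """Place a probe set relative to an annotated gene model.

    Coordinates are genomic (start <= end). For a '+' strand gene the
    3' end is gene_end; for a '-' strand gene it is gene_start.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Any overlap with the annotated model counts as within the gene.
    if probe_start <= gene_end and probe_end >= gene_start:
        return "within_gene"
    if strand == "+":
        distance = probe_start - gene_end   # positive when downstream
    else:
        distance = gene_start - probe_end   # positive when downstream
    if 0 < distance <= max_downstream:
        return "downstream_of_3prime_end"
    return "other"


# Hypothetical minus-strand gene spanning 1000-5000.
inside = classify_probe(1200, 1300, 1000, 5000, "-")
extended = classify_probe(400, 500, 1000, 5000, "-")
upstream = classify_probe(5500, 5600, 1000, 5000, "-")
```

A probe set in the second category is a candidate for reporting an extended 3'UTR, which can then be compared with the RNA-Seq coverage over the same interval.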

Extensive use of extended 3'UTRs in embryonic kidney 
subcompartments 

The 3' UTR contains cis-regulatory elements important 
for mRNA stability, degradation, subcellular localization 
and translation. Therefore, accurate characterization of 
3'UTR boundaries can help identify key regulatory elements, 
such as microRNA (miRNA) binding sites. In 
order to identify expression beyond currently annotated 
3'UTR boundaries, we used a sliding window to survey 
contiguous signal within a 20 kb radius from the anno- 
tated 3' end (excluding regions overlapping known 
RefSeq transcripts including ncRNAs). This approach 
identified over 1500 genes with 3'UTRs that extend well 
beyond the mouse RefSeq boundary. Extended UTR 
sequence genomic coordinates identified by RNA-Seq 
were obtained from mm9 using Galaxy [24] to deter- 
mine if such events were novel or due to incomplete 
annotations. We found that 720 instances of these 
extended UTRs have been seen in RefSeq orthologs, 
often as part of transcripts in genomes with more 
complete annotations such as human RefSeq (hg18). 
Overall we find 532 transcripts with previously unannotated 
3'UTR extensions, demonstrating the widespread 
nature of this transcriptional event in the embryonic 
kidney (Additional file 3). 

Figure 2 Visualization of RNA-Seq and kidney subcompartments microarray data on UCSC Genome Browser of Lhx1 long 3'UTR. 
Representation of the 3' end of the mouse Lhx1 gene (chr11:84330068-84335347) (mm9) is shown within the genome browser along with 
default and custom tracks. A: Affymetrix mouse 430.2 microarray platform probesets 1421951_at localized to the canonical 3' untranslated region 
(UTR) and 1450428_at ~500 bp downstream; and corresponding probeset expression heatmap across kidney subcompartments (microarray data 
from [5]). Microarray compartments from top to bottom of heatmap: ureteric tip; s-shaped body; proximal tubule; cortical, and medullary 
interstitium; medullary, and cortical collecting duct; renal corpuscle; cap mesenchyme; loop of Henle; renal vesicle. B: RNA-Seq exon junction 
tags are represented as UCSC Genome Browser BED data tracks (top) spanning exons, and 'wiggle' plots showing coverage of negative strand 
tags corresponding to Lhx1 expression (bottom). C: Riboprobes used for in situ hybridization (ISH): i) overlapping the canonical region as 
represented by Affymetrix probeset 1421951_at and ii) overlapping extended 3' signal captured by RNA-Seq and probeset 1450428_at, which 
also contains a microRNA binding site for miR-30 [28]. D: Histological 15.5 dpc mouse kidney section ISH (SISH) of canonical 3'UTR (i) and 
extended 3'UTR (ii) both detected in distal compartments of the renal vesicle. E: Pre-built UCSC genome browser data tracks of: (top-bottom) 
mouse RefSeq genes, Ensembl gene model predictions, mouse expressed sequence tags (ESTs). Green tags represent EST tags derived from 
kidney cDNA libraries, and evolutionarily conserved regions (black). 
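
The sliding-window survey for contiguous signal beyond the annotated 3' end can be sketched as follows. This is a minimal Python illustration under stated assumptions: per-base coverage downstream of the annotated 3' end is given as a list oriented in the direction of transcription, and the window size, step, and minimum mean coverage are hypothetical parameters that are not specified in the text. Only the 20 kb search limit is taken from the description above.

```python
def extended_utr_length(coverage, window=100, step=50, min_mean=1.0,
                        max_extension=20_000):
    """Length (bp) of contiguous signal downstream of an annotated 3' end.

    coverage[i] is the read depth i bases past the annotated 3' end.
    Windows are slid from the annotated end outwards; the extension
    stops at the first window whose mean coverage drops below min_mean.
    """
    limit = min(len(coverage), max_extension)
    extension = 0
    start = 0
    while start + window <= limit:
        window_mean = sum(coverage[start:start + window]) / window
        if window_mean < min_mean:
            break
        extension = start + window
        start += step
    return extension


# Hypothetical locus: 1.5 kb of signal at depth 5, then no coverage.
example = [5] * 1500 + [0] * 500
length = extended_utr_length(example)
```

Because a window is accepted on its mean coverage, the reported boundary can overshoot the true end of signal by up to one window, so in practice the estimate would be refined against the per-base coverage.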

We then asked whether extended 3'UTR expression 
was prevalent among genes critical for kidney develop- 
ment by focusing on genes involved during mesenchy- 
mal-epithelial transition (MET), which is a critical 
process for nephron development. Extended UTR 
expression was detected within the Lhx1 locus, a critical 
transcriptional regulator of nephron endowment [25,26] 
(Figure 2). A ~1.5 kb signal beyond the RefSeq annotated 
3' end was detected and represented by probesets 
1421951_at (canonical 3'UTR based on RefSeq models) 
and 1450428_at (extended UTR) with high concordant 
expression (Pearson correlation R = 0.932). Section ISH 
(SISH) also confirmed the concordant expression 
between the extended 3'UTR and the remaining por- 
tions of the transcript, localized to the nephron precur- 
sor structures (renal vesicle, s-shaped body and nephron 
tubules) (Figure 2C, D). SISH data and detailed annota- 
tions are available at [27]. Studies have described 
miRNA binding sites for miR-30 within the extended 
region of the Lhx1 3'UTR, where miR-30 inhibits Lhx1 
expression and therefore embryonic kidney differentia- 
tion [28] (Figure 2C). This region overlaps with the 
extended signal detected in our RNA-Seq data, high- 
lighting the importance of accurate representation of 
gene boundaries. 
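
The concordance between the canonical and extended probe sets is summarized above by a Pearson correlation across subcompartments. A small self-contained Python sketch of that statistic follows; the function name is illustrative and the expression values are hypothetical placeholders, not the atlas data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-subcompartment signals for two probe sets.
canonical_utr = [120.0, 950.0, 310.0, 80.0, 1500.0]
extended_utr = [100.0, 800.0, 290.0, 95.0, 1250.0]
r = pearson_r(canonical_utr, extended_utr)
```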

Alternate exon usage associated with key kidney 
development loci 

Large-scale identification of alternative splicing is an 
essential prerequisite for downstream functional 
characterization of how genes 
are regulated in a tissue-specific manner and the roles 
of alternate isoforms during developmental states. Alternative 
splicing can alter mRNA through a variety of 
mechanisms, including the addition and removal of 
exons, thereby affecting protein functional domain com- 
position [29]. To identify the presence of isoforms asso- 
ciated with alternate exon usage, reads were mapped to 
a predefined library of known exon-junction sequences, 
as described in [11,30]. Results from the mapping 
revealed 3568 loci (> 1 RPKM) where alternate exon- 
junctions were detected (Additional file 4). 
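
One way to flag loci with alternate exon-junction usage from junction tag counts is sketched below in Python. The data layout (a mapping from locus, donor coordinate and acceptor coordinate to tag count) and the rule used (a donor joined to more than one acceptor, or an acceptor joined to more than one donor) are assumptions made for illustration, and the locus names and counts are hypothetical.

```python
from collections import defaultdict


def loci_with_alternate_junctions(junction_tags, min_tags=1):
    """Return sorted loci where a splice site pairs with several partners.

    junction_tags maps (locus, donor, acceptor) to a tag count.
    Junctions supported by fewer than min_tags tags are ignored.
    """
    acceptors_by_donor = defaultdict(lambda: defaultdict(set))
    donors_by_acceptor = defaultdict(lambda: defaultdict(set))
    for (locus, donor, acceptor), count in junction_tags.items():
        if count < min_tags:
            continue
        acceptors_by_donor[locus][donor].add(acceptor)
        donors_by_acceptor[locus][acceptor].add(donor)

    flagged = set()
    for locus in acceptors_by_donor:
        multi_acceptor = any(len(a) > 1
                             for a in acceptors_by_donor[locus].values())
        multi_donor = any(len(d) > 1
                          for d in donors_by_acceptor[locus].values())
        if multi_acceptor or multi_donor:
            flagged.add(locus)
    return sorted(flagged)


# Hypothetical counts: GeneA skips an exon (one donor, two acceptors).
tags = {
    ("GeneA", 100, 200): 12,
    ("GeneA", 100, 300): 5,
    ("GeneB", 100, 200): 30,
    ("GeneB", 250, 400): 8,
}
alternate = loci_with_alternate_junctions(tags)
```

Raising `min_tags` trades sensitivity for specificity: in this toy example a threshold of 10 tags would no longer flag GeneA, because its second junction is supported by only 5 tags.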

To gauge our effectiveness in detecting transcriptional 
complexity arising from key loci, we reviewed the tran- 
scriptional output from key kidney development genes 
and detected previously known variants (Table 2). For 
example, the Ret isoforms Ret51 and Ret9, which have different 
temporal requirements in the developing kidney, 
were identified through tags spanning exon-exon 
junctions and expression tags, where differential expres- 
sion was observed at the C-terminal tails as previously 
reported [31] (Additional file 5). 



In addition, uncharacterized splicing events were also 
detected. In Wt1, two main splicing events have been 
previously identified and characterized: splicing of exon 
5 and exon 9 +/-KTS domain [32]. Together with three 
known alternate transcriptional start sites, up to 24 Wt1 
protein isoforms are predicted with the ratio of isoform 
abundance proposed to be critical for normal develop- 
ment [33]. The RNA-Seq dataset detected both pre- 
viously described alternate splicing events together with 
a novel isoform lacking both exons 4 and 5 [Ensembl 
Transcript: ENSMUST00000111100, Ensembl protein: 
ENSMUSP00000106729] (Figure 3A), where expression 
has been confirmed by qRT-PCR (Additional file 6). 
Previously, isoforms lacking exon 4 have only been 
reported in kidneys of aquatic/semi-aquatic animals 
including eel, medaka, and turtle [34-36] with such iso- 
forms proposed to represent an event no longer 
required for mammalian metanephric kidney development. 
Our data would question this conclusion. Alternate 
donor-acceptor splice sites (GT-AG) across exon 
junctions were also detected among key kidney development 
regulators such as Six2 and Wnt4 (see Table 2). 

Table 2 Transcriptional complexity and discovery across regulators of kidney development 

Gene   Known variants (mouse/human)  Variant junction location  Type              Supporting transcript models                  Number of tags
Pax2   NC                                                       Alt 5'/Promoter                                                 Signal
       Pax2a/b                       chr19:44865283-44890407    Cassette Exon     ENSMUST00000111979                            13
                                     chr19:44831917-44835374    Donor/Acceptor    Pax2.bSep07                                   13
       NC                            chr19:44909958-44910469    Donor/Acceptor    N/A                                           10
Wt1    NC                            chr2:104973?59-105003491   Skip exon 4 & 5   ENSMUST00000111100                            10
       Wt1 -exon 5                   chr2:104983310-105003491   Skip exon 5       ENSMUST00000111101                            350
       +/- KTS                       chr2:105010157-105012389   Donor/Acceptor    ENSMUST00000139585                            83
Sall1  Isoform A (long) & B (short)  chr8:91557288-91566260     Alt. 5'/promoter  A: (hg19) NM_002968; B: (hg19) NM_001127892   9
       NC                            chr8:91566334-91567384     Overlapping Exon  Sall1.dSep07                                  5
Eya1   NC                            chr1:14294546-14294663     Alt. Exon         Eya1.fSep07                                   3
       Isoform 1-4                   chr1:14273270-14294515     Cassette Exon     ENSMUST00000080664                            10
                                     chr1:14260914-14264155     Donor/Acceptor    ENSMUST00000027066                            32
                                     chr1:14264215-14264624     Donor/Acceptor    Eya1.hSep097                                  4
                                     chr1:14264279-14264624     Donor/Acceptor    Eya1.aSep07                                   35
Gdnf   Isoform 1-2                   chr15:7760047-7787580      Alt. 5'/promoter  Gdnf.aSep07                                   Signal
                                     chr15:7765678-7784357      Donor/Acceptor    Gdnf.bSep07                                   5
Ret    Ret51 (long)                  chr6:118104019-118105315   Retained intron   NM_009050                                     4
       Ret9 (short)                  -                          Overlapping Exon  NM_001080780                                  Signal
Wnt11  Isoform A & B                 chr7:105983621-106002321   Alt. promoter     Wnt11.cSep07                                  Signal
                                     chr7:105987691-105994975   Donor/acceptor    Wnt11.cSep07                                  5
Bmp7   NC                            chr2:172693513-172766073   Alt. exon         Bmp7.aSep07                                   Signal
Pax8   NC                            chr2:24298651-24300095     Donor/Acceptor    ENSMUST00000129538                            3
       Isoform C                     chr2:24291401-24291977     Donor/Acceptor    ENSMUST00000102940                            8
Six2   NC                            chr17:86084844-86086736    Donor             N/A                                           6
Fgf8   Isoform 2 & 3                 chr19:45816160-45816410    Cassette Exon     NM_001166361; NM_001166362                    4
Wnt4   NC                            chr4:136845255-136851407   Acceptor          Wnt4.bSep07                                   4

(NC: not characterized; ?: digit illegible in the source) 

Temporo-spatial loci with uncharacterized 5' exons and 
alternative promoter signal 

Alternative promoters, including those associated with 
alternate 5' exon usage, can be activated in a tissue-specific 
manner. For example, a Nephrin (Nphs1) isoform 
with exon 1a is detected in kidney and plays an important 
role in renal filtration [37] while the variant with 
exon 1b is only detected in brain [38]. The presence of alternative 
promoters associated with key temporo-spatial 
kidney development loci warrants subsequent 
experimental validation to determine their potential role 
in the regulation of gene expression. To identify alternative 
promoters, the most 5' exon junction tags beyond 
the RefSeq gene models were screened for evidence of 
alternate or complex promoter usage. A minimum cut-off 
of 10 tags at each candidate junction was required, 
which returned a total of 374 alternate exons associated 
with 187 genes (Additional file 7). Alternative 5' usage 
was detected among four key kidney development regulators 
(Table 2), including a shorter novel promoter for 
Sall1, an early inducer of kidney development, supported 
by RNA-Seq signal (see Figure 4B). Alternative 
5' exon junctions in Sall1 were also detected, and this 5' 
complexity could be due to the multiple expression sites 
of this gene. Sall1 expression is detected during the initial 
stages of kidney development and subsequently 
in nephron progenitors, but also in the subsequently 
formed early nephron epithelium [39]. Extended 
promoter signal ~12 kb beyond the RefSeq annotated 
start site was also detected for Pax2 (Figure 3B), which 
is expressed in both the ureteric epithelium and 
mesenchyme [40]. This promoter region encompasses a 
4.1 kb minimal promoter that is only expressed in ure- 
teric bud epithelia [41]. As the prediction of transcrip- 
tion factors (TFs) that regulate a cohort of genes 
requires the precise determination of the potential pro- 
moter region, using the standard promoter regions 
based on RefSeq gene models in these analyses may lack 
sensitivity. Incorporation of this RNA-Seq derived infor- 
mation into TF binding site predictions should uncover 
TF regulators of importance to the developing kidney 
and also aid in the design of promoter-reporter green 
fluorescent protein (GFP) constructs in transgenic mice 
to understand mechanisms regulating tissue- and cell- 
specific expression. 

Sequencing of embryonic kidney miRNAs 

MiRNAs are short, non-coding species of RNA (~22nt) 
that function as translational repressors of target 
mRNAs during many biological processes including 
development, differentiation, cell proliferation and disease 
[42,43]. Within the kidney, tissue-specific knockout 
of Dicer, an enzyme required for miRNA biogenesis, has 
previously been reported to alter anatomical organiza- 
tion and to also play a role in renal diseases [44-46]. 
Identification of the complete miRNA repertoire in the 
embryonic kidney will serve as an important reference 
of developmentally regulated miRNAs for functional 
characterization. To catalogue active miRNAs within the 
developing mouse kidney, we isolated and 
sequenced the small RNA fraction (SOLiD, Applied Biosystems) 
and mapped the reads against the entire miRBase 
(v15) database [18]. This identified 
over 170 microRNA families with a high 
number of mapped tags (> 100 tags) (Additional file 8). 
MiR-30, previously shown to be a critical regulator of 
kidney development [28], was abundantly detected in our 
miRNA-Seq dataset. The miR-200 
family was also abundantly detected in the embryonic 
kidney, which is likely due to its role in MET regulation 
[47,48]. Functional characterization of many more of 
the kidney miRNAs identified by miRNA-Seq will be 
required to infer roles during organogenesis. 
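
Collapsing per-miRNA tag counts into families and applying the > 100 tag cut-off can be sketched in Python as below. The family rule used here (truncating names such as mmu-miR-30a to miR-30) is a simplifying assumption and does not reproduce curated miRBase family assignments, in which, for example, the miR-200 family also includes members with other numbers. The identifiers and counts are hypothetical.

```python
import re
from collections import Counter

# Assumption: the family is the 'let' or 'miR' prefix plus the number.
_FAMILY_PATTERN = re.compile(r"(let|miR|mir)-(\d+)")


def family_name(mirna_id):
    """Approximate family label, e.g. 'mmu-miR-30a' -> 'miR-30'."""
    match = _FAMILY_PATTERN.search(mirna_id)
    if match is None:
        return mirna_id
    prefix = "let" if match.group(1) == "let" else "miR"
    return f"{prefix}-{match.group(2)}"


def abundant_families(tag_counts, min_tags=100):
    """Sum tags per family and keep families with more than min_tags."""
    totals = Counter()
    for mirna_id, count in tag_counts.items():
        totals[family_name(mirna_id)] += count
    return {family: n for family, n in totals.items() if n > min_tags}


# Hypothetical tag counts, for illustration only.
counts = {
    "mmu-miR-30a": 80,
    "mmu-miR-30c": 40,
    "mmu-miR-200b": 150,
    "mmu-let-7b": 100,
    "mmu-miR-214": 20,
}
families = abundant_families(counts)
```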
Figure 3 Transcriptional complexity of kidney development regulatory genes. A: Evidence of known and novel exon splicing in Wt1 
(positive strand). Exon junction tags (> 3 tags) represent differential exon usage. The novel splicing event involving exons 4 and 5 is marked. 
Canonical RefSeq and supporting Ensembl gene models of predicted isoforms are shown. B: Pax2 locus with spliced exon 6, represented by 
exon junction tags (> 3 tags), resembling the PAX2 RefSeq human isoform, shown below the mouse RefSeq track. Expression beyond mouse RefSeq 
gene boundaries was also captured. Exons are numbered below mouse RefSeq models. 



Mesenchymal-specific expression of miR-214/Dnm3os in 
the developing kidney 

One of the first steps to gain insights into the biological 
role of miRNAs is to determine tissue localization. SISH 
studies based on mature miRNA sequence hybridiza- 
tions can be challenging due to the limited unique 
sequence content of these short molecules. To overcome 
this, several studies have described using miRNA pre- 
cursor genes, known as primary transcripts (pri- 
miRNA), as a proxy to monitor expression of nested 
miRNAs [49,50]. Kidney miRNAs from the miRNA-Seq 
data were matched to corresponding intergenic noncoding 
pri-miRNAs (as annotated by Saini HK et al. [51]) 
that were also expressed in the mRNA-Seq data. We 
identified 22 highly expressed intergenic pri-miRNAs 
hosting kidney miRNAs, including the Wilms tumor 
(renal neoplasm)-associated and imprinted transcript 
H19 [52], a precursor for mir-675 [53], and the mir-17-92 
cluster Mirhg1 pri-miRNA, with the latter being 
involved in embryonic lung proliferation and differentiation 
[54] (Additional file 9). 

Next, we identified pri-miRNAs that were represented 
by Affymetrix 430.2 probeset from the kidney subcom- 
partment atlas microarray data (Additional file 10). Of 
these probesets, three were co-incidentally positioned to 
overlap the embedded miRNAs within the primary 
transcript (let-7b: 1440357_at; miR-425: 1459927_at; miR-214: 
1427298_at). Of these, miR-214 from the Dnm3os 
host gene provided the most reliable probe set expression 
profile. Dnm3os has been described to serve important 
roles during embryo development [55,56] although 
it has never been described within the context of the 
kidney. Microarray probeset expression was detected in 
all interstitial mesenchyme subcompartments except the 
Six2+ nephron progenitor population (Figure 4B). SISH 
validation of Dnm3os/miR-214 confirmed the interstitial 
mesenchyme-specific expression profile, but expression was also 
detected in the cap mesenchyme (Figure 4B and [GUDMAP:10816]). 
Further validation will be required to 
determine which cellular population of the cap 
mesenchyme miR-214 is restricted to and whether it is 
distinct from the Six2 population. 

Widespread expression of sense/anti-sense transcripts 
pairs in the embryonic kidney 

The strand specific information of our RNA-Seq data 
enabled a genome-wide survey of sense-antisense transcription. 
Overlapping sense and antisense transcription 
has been described in a variety of biological roles, 
including RNA editing, genomic imprinting, translational 
regulation, and RNA interference [57-60]. Current 
lists of validated sense-antisense pairs include many 
important developmental genes such as Pax2 and 
Hoxa11 [61]. Within the kidney, the noncoding antisense 
WT1 transcript (WT1-AS) shares the same 
expression domains as WT1 and therefore is consistent 
with its role as a positive regulator of WT1 protein 
levels [62]. Many splice-forms of WT1-AS have been 
characterized, where defects in the splicing machinery 
are implicated with acute myeloid leukaemia [63]. Sur- 
vey of sense-antisense transcript pairs in the 15.5 dpc 
kidney identified 59.7% of expressed RefSeq transcripts 
with corresponding coding and non-coding antisense 
partners (Additional file 11) where only 2654 have been 
previously documented in the Natural Antisense Tran- 
script Database (NATsDB) [64]. Antisense transcripts 
were detected for several kidney developmental genes, 
including Wtl [62], Salll, Pax2 [65]Lhxl, Six2, Hnflb, 
Emx2 [66] and Wnt7b, where the majority overlapped in 
a head-to-head orientation. Examples of tail-to-tail
(Wnt9b) and embedded overlaps (Tcf21) were also
detected. Only a few of these kidney development
antisense transcripts (e.g. Lhx1) were represented on the
Affymetrix platform.

Figure 4 Mesenchyme-specific expression of host gene Dnm3os for miR-214. Dnm3os ncRNA host gene for microRNAs, miR-199 and miR-214.
A: Affymetrix probeset 1427298_at directly overlaps miR-214. Microarray compartments from top to bottom of heatmap: ureteric tip; s-
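
The overlap classes named above (head-to-head, tail-to-tail and embedded) can be made concrete with a small sketch. The Python function below is illustrative only and is not the pipeline used in this study: the function name, the coordinate convention (start < end on forward-strand coordinates) and the example intervals are assumptions introduced here.

```python
def classify_overlap(sense, antisense):
    """Classify how an antisense transcript overlaps a sense transcript.

    Each transcript is a (start, end, strand) tuple with start < end,
    given in forward-strand genomic coordinates (hypothetical convention).
    """
    s_start, s_end, s_strand = sense
    a_start, a_end, a_strand = antisense
    if s_strand == a_strand:
        raise ValueError("transcripts must lie on opposite strands")
    if a_end <= s_start or s_end <= a_start:
        return "no overlap"
    # One transcript lies entirely within the other.
    if (s_start <= a_start and a_end <= s_end) or \
       (a_start <= s_start and s_end <= a_end):
        return "embedded"
    # Partial overlap: decide which end of the sense transcript is covered.
    covers_lower_end = a_start < s_start
    sense_5p_is_lower_end = (s_strand == "+")
    if covers_lower_end == sense_5p_is_lower_end:
        return "head-to-head"  # the 5' ends of the pair overlap
    return "tail-to-tail"      # the 3' ends of the pair overlap


print(classify_overlap((100, 200, "+"), (50, 120, "-")))
```

Under this convention a bidirectional-promoter-like arrangement, where the two 5' ends overlap, is reported as head-to-head regardless of which strand carries the sense transcript.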

To determine whether antisense transcripts were spatially
associated with their kidney development-associated sense
counterparts, high-resolution SISH was
performed on a small subset of these candidates. All
three antisense transcripts, for Six2, Sall1 and Lhx1,
showed subcompartment expression correlated with that of
the sense counterpart, although possibly at varying levels
of intensity (Figure 5; see also [GUDMAP:8504] for
validation of the Lhx1 antisense transcript,
1500016L03Rik). The previous
association between head-to-head orientation and
positive regulation of expression would agree with the
higher intensity of both sense and antisense
Sall1 expression in the early nephrons, as opposed
to the lower levels of expression in the cap mesenchyme
nephron progenitors (Figure 5B). Detailed annotations
of SISH images are available at [27]. The identification
of these antisense transcripts further supports the
prevalence of natural antisense transcription in the genome
[60], which is likely to contribute to the regulation of
kidney developmental programs.

Figure 5 Histological sections ISH (SISH) comparative analyses of sense and uncharacterized antisense transcripts expression. SISH

Transcriptional complexity during mesenchymal-epithelial 
transition 

Representations of biological networks and pathways
typically report a gene as a single node, neglecting
features of transcriptional complexity. To assess the extent
of transcriptional complexity within kidney development
networks, we surveyed the MET program. This critical
renal development event is paramount for normal renal
function, and its disruption can alter nephron number,
which in turn predisposes individuals to kidney diseases
[2]. A current review of kidney development describes 17
well-characterized loci [2] as being involved in this MET
event. However, like many such reviews, it is gene-centric
in nature. Our data show extensive transcriptional
complexity associated with all but two of the described MET
developmental genes (Figure 6), and we have described
the transcriptional landscape of this crucial biological
process.

Figure 6 Transcriptional complexity of the mesenchymal-
epithelial transition network. Transcriptional complexity
associated with the 17 best-characterized mesenchymal-epithelial
transition (MET) pathway genes. Genes with evidence of
alternative splicing, including alternate exon usage and alternate
5' and 3' exons, are highlighted with black circles. Genes with
long 5' and/or 3' UTR signal are represented by white circles and
those with antisense transcripts by blue circles. Literature
evidence of microRNA association is represented for Lhx1 (miR-30)
and Hoxa11 (miR-181), along with other known transcriptional
regulatory relationships (dotted arrows).
Figure modified from Little et al [2].

For eight loci with evidence of alternative exon usage,
we scanned for changes in protein domain composition
to infer functional changes. Of the four RefSeq
canonical isoforms for Fgf8, two (variants 2 and 3)
were detected in the kidney; these differ in the
presence or absence of exon 4 [67]. Removal of this exon
excludes the signal peptide normally associated with this
growth factor, presumably leading to an intracellular
protein with a different biological role. This may have
implications for the formation of the renal vesicle, the
first stage of nephron induction, where Fgf8 is expressed
and has been assumed to act as a secreted protein.

Alternative 5' ends were identified for the Gdnf, Pax2,
Eya1 and Wnt11 loci. In humans, EYA1 is associated
with three isoforms differing at the first exons [68]. In
addition, RNA-Seq provided evidence for an additional
uncharacterized exon between exons 1 and 2 of the
canonical Eya1 RefSeq transcript, consistent with EST
tag evidence and gene models (Aceview: Eya1.fSep07).
In the Pax2 locus, signal extending the 5' end by as much
as 10 kb provided compelling evidence for an alternative
promoter beyond the current gene models.

Signal flanking the 3' ends of the mouse RefSeq models
for genes such as Pax2, Bmp7, Wnt4 and Lhx1 was
supported by more complete gene models, such as the human
RefSeq transcripts and other gene prediction models.
SISH validation of the observed Lhx1 and Wnt4 3'
extensions confirmed these events as extensions of the
primary transcript and highlights the need for updated
gene models.

Surprisingly, natural antisense transcripts were
detected for 10 of the 17 MET genes. Several of these
antisense transcripts have previously been identified, such
as Emx2os [66] and Wt1-AS [62], both of which have been
shown to positively regulate expression of the respective
sense transcript. SISH analyses of novel antisense
expression for Six2, Sall1 and Lhx1 showed expression
patterns concordant with the sense counterpart.
Sense-antisense pairs identified for MET genes were arranged
in a head-to-head overlap at the 5' end, which may be
indicative of a bidirectional promoter, similar to Wt1-AS.

To infer candidate miRNAs involved in MET, we
scanned the literature for MET genes with experimental
evidence of miRNA target regulation. Only Lhx1 has
been characterized as a target of miR-30 within the
context of kidney development [28]. miRNA regulation of
other MET genes has been characterized in other tissue
types, including regulation of Hoxa11 by miR-181 during
muscle differentiation [69] and hypoxia-induced
targeting of Fgfrl1 by miR-210 [70]. Such transcript-centric
models reveal an undocumented layer of complexity
in current models of regulatory networks,
which should be incorporated into functional validation
studies.

Discussion 

Embryonic kidney development requires a high level of 
transcriptional co-ordination to form at least 25 known 
distinct cell types required to carry out specific renal 
functions. We have described here the first RNA-Seq 
profiling of whole embryonic mouse kidney and have 
integrated this information with previous microarray 
and SISH-based atlases of expression during kidney
development. We show that RNA-Seq offers
detailed transcriptional profiling beyond the locus-level
expression activity captured by most microarrays.

A major concern of whole organ profiling relates to 
the disproportional representation of all cell types in 
such complex cellular systems. Transcriptional profiling 
of whole organs using microarray has been problematic 
due to the heterogeneous tissue composition and pro- 
portions, which can overshadow differential gene expres- 
sion of less abundant cell types [22]. Given the 
potentially unlimited dynamic range, RNA-Seq should 
overcome this hurdle. We demonstrate here that at suf- 
ficient depth, whole kidney transcriptome profiling by 
RNA-Seq can provide the resolution and coverage to 
detect over 99.7% of subcompartment-specific tran- 
scripts. Transcriptional output from each major sub- 
compartment was also shown to be evenly distributed 
across the data based on subcompartment-specific tran- 
script expression, with RNA-Seq detecting both abun- 
dant (above 10 RPKM) and low-level tissue-specific 
transcripts (below 1 RPKM). Despite this, it is important
to note that the lack of normalization approaches for
RNA-Seq makes identification of rare, cell-type-specific
transcripts challenging, as highly expressed transcripts
obtain the most tag coverage.

The sensitivity of RNA-Seq makes whole organ profiling
ideal for integration with pre-existing microarrays of
kidney cell types to achieve single-nucleotide and
spatial resolution of transcriptional complexity. Not all
events detected by RNA-Seq could be represented by
Affymetrix probesets (e.g. alternative exons and 5'
promoters) due to the 3' end bias of the Affymetrix 430.2
probeset design. The 3' end bias was instead ideal for
surveying differential subcompartment localization of
extended 3' UTRs and detecting occasional ncRNA
transcript expression.

Overall, RNA-Seq profiling captured a wide range of
transcriptional complexity during kidney development.
These events were highlighted throughout the study among
a subset of well-established kidney developmental genes,
revealing new insights. For example, while alternative
splicing of the Wt1 locus in the kidney has been
extensively documented, we detected an uncharacterized
mouse in-frame isoform lacking exons 4 and 5. This
isoform was supported by the Ensembl mouse predicted
transcripts but has only been reported in fish and turtles
[34-36]. These two exons together encode a putative
leucine zipper motif, located in the N-terminal region of
Wt1 [34], which has previously been shown to contain
protein-protein association domains [71]. This region
allows Wt1 isoforms to self-associate, so removal
of exons 4 and 5 would alter the dimerisation of WT1
protein isoforms and their ability to interact with other
proteins [71].

The strand-specific nature of our RNA-Seq enabled
sense-antisense transcript annotations. Although various
techniques have confirmed the widespread presence of
antisense transcription in the mammalian genome
[60,72,73], detection and identification of low-abundance
antisense transcripts, low abundance being a common trait
of antisense RNA, remained challenging due to the
sequencing depth limitations of these technologies [74]. The
sequencing depth and strand-specific nature of RNA-Seq
facilitated the use of a liberal approach for the
identification of many sense-antisense transcripts, including
low-copy-number antisense transcripts. In the analysis,
several transcription factors critical for MET were
associated with overlapping antisense ncRNA transcript
expression. Many of these antisense ncRNAs showed
synexpression patterns with their sense partner during SISH
validation, including the uncharacterized antisense
transcript for Six2, a marker of the renal progenitor cell
population. The orientation is reminiscent of the Wt1
antisense transcript (WT1-AS), which has been shown to
positively regulate WT1 protein expression levels [62]
through a bidirectional promoter. Hence, this may also be
true for the Six2 and Sall1 sense/antisense transcripts.
Further functional validation will be required to determine
antisense-mediated regulation for these key protein-coding
genes.

miRNAs have been shown to play an active role during
embryonic development; however, the individual miRNAs
required for kidney development remain largely
unexplored. To address this, the miRNA population
from the embryonic kidney sample was isolated and
sequenced to serve as a reference for the entire
embryonic kidney miRNA repertoire. Next, we inferred
the subcompartment localization of miRNAs from
intergenic pri-miRNA expression. We focused on Affymetrix
probesets that directly overlapped with the
embedded miRNA, which led to the identification of
miR-214 from the Dnm3os transcript. Both SISH
riboprobe and Affymetrix probeset expression profiles
detected expression in all kidney mesenchymal/interstitial
subcompartments except the cap mesenchyme, where
it was detected by SISH but down-regulated in the
microarray profile of the Six2+ cap mesenchyme
population.

Six2 is a marker of the nephron progenitor population,
which maintains progenitor renewal by preventing
epithelial differentiation during MET. The inhibitory
nature of miRNAs suggests that miR-214 may have a role
in suppressing self-renewal and therefore promoting
differentiation. This hypothesis aligns with the previously
described role of miR-214 as a promoter of cellular
differentiation of skeletal muscle cells. miR-214 has also
been shown to promote ES cell differentiation via the
regulation of Polycomb group proteins [75] and by
modulating Hedgehog signalling [76]. In the kidney, Shh, part
of the Hedgehog signalling pathway, is required for
mesenchymal proliferation and differentiation of smooth
muscle progenitor cells [77]. This gene may also be
regulated by miR-214.

Almost all the genes involved in the MET pathway
show some form of transcriptional complexity, which is
largely unaccounted for during functional characteriza- 
tion of many of these loci. Hence, our findings now pro- 
vide an opportunity to move towards transcript-centric 
models of biological pathways and networks in kidney 
organogenesis. 

Conclusions 

In conclusion, this dataset provides a valuable resource 
with which to interrogate transcriptional control of kid- 
ney development. Integration of the RNA-Seq data with 
pre-existing resources such as tissue-specific microarrays 
and SISH provides a dynamic atlas of the spatial and 
transcriptional regulation of a developing organ, thereby 
representing an ideal baseline for comparative studies 
into kidney development abnormalities. Specifically, our 
analyses highlight new transcriptional components active 
during key stages of kidney development that can now 
be prioritized for further functional characterization. 

Methods 

Library Prep and Sequencing of mRNA and miRNA 

Total RNA (10 µg) from 46 embryonic kidneys (15.5 dpc)
from 5 litters of CD1 mice was put through one round
of poly(A) selection (Oligotex Kit, Qiagen) followed by
ribosomal depletion (Ribominus Kit, Invitrogen) to
select mRNA. The enriched mRNA was fragmented by
digestion with RNase III (Ambion) and purified on a
Microcon YM30 column (Microcon). Fragmented
mRNA was used to generate libraries as specified in the
Whole Transcriptome Analysis Kit (Ambion) protocol
for mRNA and Short RNA Expression (SREK). The
SREK library was barcoded (barcode: Series A, Applied
Biosystems) and pooled. Emulsion PCR (8x) and
large-scale enrichment (LaSE) were carried out as outlined
in the SOLiD 3 Plus template bead preparation manual.
Sequencing was carried out on the SOLiD system 3.5 with
v3.5 chemistries to produce DNA sequence reads of
35-50 nt. Datasets are available via the NCBI Short Read
Archive (SRA026710).

Mapping and Analysis 
mRNA-Seq mapping 

Mapping of SOLiD sequencing reads was performed
with a recursive mapping strategy using RNA-MATE
v1.1 [30] under default settings. Reads were mapped to
the mouse genome (mm9) and a library of exon-exon
junctions derived from gene models such as RefSeq,
UCSC known genes, Ensembl and Aceview, as previously
detailed in [11]. The resulting mapped tags were presented
as 'wiggle plots' (bedGraph data format) of tag
abundance for visualization in the UCSC Genome Browser.
The mapped tag start-site files (from the RNA-MATE
output) were used to calculate tag frequency counts against
RefSeq gene models.
RPKM normalization 

Genomic coordinates of non-redundant RefSeq
protein-coding loci were provided as BED files by the UCSC
Genome Browser curation team. Tag start files were
used to calculate expression as detailed in the RNA-MATE
manual. RefSeq gene reads per kilobase per million
mapped reads (RPKM) values were calculated in Galaxy [24]
as detailed in [10].
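
As a reminder of what the RPKM value represents, the standard definition is RPKM = 10^9 x C / (N x L), where C is the number of tags assigned to a gene, N is the total number of mapped tags and L is the gene length in bp. The sketch below simply evaluates that formula; the function name and the example numbers are illustrative assumptions and it does not reproduce the Galaxy workflow used here.

```python
def rpkm(tag_count, gene_length_bp, total_mapped_tags):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_tags <= 0:
        raise ValueError("length and total tag count must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return tag_count * 1e9 / (gene_length_bp * total_mapped_tags)


# Hypothetical example: 500 tags on a 2 kb gene in a 10 million tag library.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```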

Genome-wide identification of alternative exon and 
alternative 5' exon usage 

A minimum of 2 tags was required to consider candidate
alternate exon-exon junction events overlapping
RefSeq canonical junctions. As this produced a
large list, we reduced it to report only alternate
exon-exon junctions with > 5 tags in Additional
file 4. For alternative 5' exon usage, we used a stringent
cutoff of > 5 tags. This was to circumvent weaker signal
at the 5' end arising from the 3' bias of
RNA-Seq protocols [7].
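
The two-tier tag thresholds described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions: junctions are represented as a dictionary from a made-up junction label to its tag count, and "> 5 tags" is read as strictly more than five.

```python
CANDIDATE_MIN_TAGS = 2   # at least 2 tags to be considered a candidate
REPORT_ABOVE_TAGS = 5    # reported junctions need more than 5 tags


def filter_junctions(junction_tags):
    """Return (candidates, reported) dictionaries of junction -> tag count."""
    candidates = {j: n for j, n in junction_tags.items()
                  if n >= CANDIDATE_MIN_TAGS}
    reported = {j: n for j, n in candidates.items()
                if n > REPORT_ABOVE_TAGS}
    return candidates, reported


# Hypothetical junction labels and counts.
tags = {"geneA:e1-e3": 1, "geneA:e2-e4": 3, "geneB:e1-e2": 6, "geneC:e2-e5": 5}
candidates, reported = filter_junctions(tags)
print(sorted(candidates), sorted(reported))
```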
Extended 3' UTR 

Tags mapping downstream of the 3' UTR boundary of
RefSeq and UCSC Genes were analysed in 30 bp windows
along a 20 kb (non-overlapping) radius. The presence
of an extended 3' UTR was assessed for genes above
1 RPKM. Expression beyond the 3' UTR of RefSeq gene
models was required to: a) be greater than 50% of the
RPKM value, b) be present in any 10 consecutive
30 bp sliding windows, and c) extend for more
than 500 bp.
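
One possible reading of the three criteria above is sketched below. It is an interpretation, not the original implementation: it assumes that per-window expression is already expressed in RPKM-like units, that a window counts as expressed when it exceeds 50% of the gene RPKM, and that the extension length is measured from the annotated 3' boundary to the end of the furthest run of at least 10 consecutive expressed windows. All names are hypothetical.

```python
WINDOW_BP = 30
MIN_RUN_WINDOWS = 10
MIN_EXTENSION_BP = 500


def extension_length(gene_rpkm, window_rpkm):
    """Distance (bp) from the 3' boundary to the end of the furthest
    run of >= MIN_RUN_WINDOWS consecutive expressed windows."""
    threshold = 0.5 * gene_rpkm          # criterion (a), as assumed here
    extension = 0
    run = 0
    for i, value in enumerate(window_rpkm):
        run = run + 1 if value > threshold else 0
        if run >= MIN_RUN_WINDOWS:       # criterion (b)
            extension = (i + 1) * WINDOW_BP
    return extension


def has_extended_utr(gene_rpkm, window_rpkm):
    if gene_rpkm <= 1.0:                 # only genes above 1 RPKM are assessed
        return False
    return extension_length(gene_rpkm, window_rpkm) > MIN_EXTENSION_BP  # (c)


# Hypothetical gene at 4 RPKM with 20 expressed windows (600 bp) downstream.
print(has_extended_utr(4.0, [3.0] * 20 + [0.0] * 10))
```

Other readings are possible, for example applying the 50% threshold to the mean downstream signal rather than to each window.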
Sense-Antisense transcripts 

Antisense expression was annotated against RefSeq
transcript coordinates obtained from the UCSC Genome
Browser (mm9). Antisense partners were required
to have expression greater than 10 RPKM. Reads were
required to map on the opposite strand of the RefSeq
transcript, within the annotated coding or untranslated
regions.
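
The antisense criterion can be illustrated with a short sketch that counts opposite-strand tags within a transcript and converts the count to RPKM. This is a simplified illustration under explicit assumptions: the transcript is treated as one contiguous interval (exon structure is ignored), tags are reduced to a start position and strand, and all names and numbers are hypothetical.

```python
def antisense_rpkm(transcript, tags, total_mapped_tags):
    """RPKM of tags on the strand opposite to the transcript.

    transcript: (chrom, start, end, strand), half-open interval.
    tags: iterable of (chrom, position, strand) tag start sites.
    """
    chrom, start, end, strand = transcript
    count = sum(1 for t_chrom, t_pos, t_strand in tags
                if t_chrom == chrom and start <= t_pos < end
                and t_strand != strand)
    return count * 1e9 / ((end - start) * total_mapped_tags)


def has_antisense_partner(transcript, tags, total_mapped_tags, min_rpkm=10.0):
    return antisense_rpkm(transcript, tags, total_mapped_tags) > min_rpkm


# Hypothetical 1 kb transcript on the plus strand.
tx = ("chr1", 1000, 2000, "+")
tags = [("chr1", 1100, "-"), ("chr1", 1500, "-"), ("chr1", 1900, "-"),
        ("chr1", 1200, "+"), ("chr1", 2500, "-")]
print(antisense_rpkm(tx, tags, 100_000))
```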
MiRNA-Seq mapping

Small RNA sequencing tags were aligned against
miRBase v15 pre-miRNA hairpins using miRNA-MATE, an
open source alignment tool designed in our laboratory
specifically for colour-space miRNA analysis
(http://grimmond.imb.uq.edu.au/miRNA-MATE/; manuscript
in preparation). miRNA-MATE uses the recursive style
of matching described in Cloonan et al [30] for
sensitive miRNA expression detection, but can also identify
and strip the adaptor to determine the precise ends of
the captured miRNAs. During alignment, up to 2
mismatches were allowed, treating valid-adjacent
mismatches (colour-space mismatches that, when located
side by side, indicate the presence of a single nucleotide
variant) as a single mismatch.
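
To illustrate the valid-adjacent rule, the sketch below counts mismatches between a colour-space read and its reference. It relies on a general property of two-base colour encoding: a single nucleotide change alters two neighbouring colours while leaving their combination (XOR of the colour codes 0-3) unchanged. This is not the miRNA-MATE source code; the function name and example strings are assumptions.

```python
def count_mismatches(read_colours, ref_colours):
    """Count colour-space mismatches, scoring a valid-adjacent pair as one.

    Both arguments are equal-length strings of colour codes '0'-'3'.
    """
    if len(read_colours) != len(ref_colours):
        raise ValueError("read and reference must have equal length")
    mismatches = 0
    i = 0
    n = len(read_colours)
    while i < n:
        if read_colours[i] == ref_colours[i]:
            i += 1
            continue
        mismatches += 1
        pair_is_valid = (
            i + 1 < n
            and read_colours[i + 1] != ref_colours[i + 1]
            and (int(read_colours[i]) ^ int(read_colours[i + 1]))
            == (int(ref_colours[i]) ^ int(ref_colours[i + 1]))
        )
        # A valid-adjacent pair is consumed together as a single mismatch.
        i += 2 if pair_is_valid else 1
    return mismatches


# Reference ACGT is colours 131; the variant ATGT is colours 311.
print(count_mismatches("311", "131"))  # 1
```

With this counting, a read would be accepted when `count_mismatches(...) <= 2`, matching the limit of 2 mismatches stated above.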
Comparisons against Affymetrix probesets 
Probesets were created from a consensus sequence
obtained from NetAffx [78]. The consensus sequence
was mapped to the mm9 genome using BLAT with
default parameters. Alignments were scored according to
the UCSC Genome Browser guidelines [79]. If a consensus
sequence matched two or more locations with the same
highest score, all of these top-scoring mappings
were included. Individual probes from each Affymetrix
probeset were mapped to a library of consensus probeset
sequences obtained from NetAffx. Probesets were then
placed on the genome based on the mapping coordinates
of the consensus sequence.
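
The rule for multi-mapping consensus sequences (retain every location that ties for the highest score) can be sketched as follows. The alignment tuples and scores are hypothetical, and the score is assumed to have been computed beforehand.

```python
def top_scoring_alignments(alignments):
    """Keep all alignments that share the highest score.

    alignments: list of (chrom, start, end, score) tuples.
    """
    if not alignments:
        return []
    best = max(score for _, _, _, score in alignments)
    return [a for a in alignments if a[3] == best]


# Hypothetical hits for one consensus sequence: two loci tie for the top score.
hits = [("chr1", 100, 600, 480), ("chr7", 2000, 2500, 480), ("chrX", 50, 400, 310)]
print(top_scoring_alignments(hits))
```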

Riboprobe design and generation 

The complete protocol for digoxigenin (Dig)-labeled
riboprobe synthesis is available and described in detail
on the GUDMAP gene expression database [80]. Primers
were ordered from Invitrogen and were designed
to amplify a 3' UTR region of the RIKEN Fantom3
cDNA clone models, between 500 and 800 bp.
Riboprobes were amplified from 15.5-dpc whole embryonic
mouse cDNA. The 3' primers were tagged with a T7
promoter sequence for in vitro transcription of
Dig-labeled riboprobes with T7 polymerase (Roche).
Riboprobes were then purified with lithium chloride
precipitation and stored at -20°C overnight. Samples were
then spun for 20 min at 4°C, the supernatant was
discarded, and the pellets were gently washed with
chilled 70% ethanol and spun again at 4°C. Supernatants
were discarded and samples dried for 10 min at
room temperature, after which pellets were resuspended
in 25 µl of water and stored at -70°C.

Section in situ hybridization validations 

The complete protocol for section in situ hybridization
(SISH) is available and described in detail on the
GUDMAP gene expression database [80]. For Dnm3os,
manual SISH was performed using NTM-based dye. The
complete protocol is described in [81]. Briefly, 7 µm
paraffin sections of 15.5 dpc CD1 mouse kidneys were
incubated in 10 µg/ml proteinase K for 20 min at room
temperature. Next, samples were washed and refixed
with 4% paraformaldehyde for 10 min at room
temperature. This was followed by acetylation and
pre-hybridization using hybridization solution for 2 h at room
temperature. Hybridization was carried out overnight at
60°C. Slides were then washed in NT buffer at room
temperature before incubating for 2 h with blocking
solution in a humidified chamber. A 1:1000 dilution of
anti-digoxigenin antibody (Roche Applied Science) in
blocking solution was added to the slides and incubated
overnight at 4°C. Unbound antibodies were removed by
washing in NT buffer. Sections were equilibrated in
NTM buffer and incubated in color solution until purple
staining was satisfactory.

Quantitative RT PCR 

To validate the Wt1 splice event (spliced exons 4 and 5)
detected in the RNA-Seq data, the mRNA level of
the uncharacterized event was compared against that of a
well-characterized splice event (spliced exon 5) of Wt1.
PCR was performed in quadruplicate using the same matched
sample that was used to generate the RNA-Seq cDNA
libraries. Samples were run with the Actin housekeeping
gene as a positive control. Primers were designed to span
the junction between exons 3 and 6 (Kidney_Wt1_minus
exons 4 and 5) (Forward: CCCCTACTGACAGTTGCACA;
Reverse: TACTGGGCACCACAGAGGAT). As a
control, primers were also designed for a known Wt1
splice event (Kidney_Wt1_ctrl minus exon 5 (known))
(Forward: CTTGAATGCATGACCTGGAA; Reverse:
TACTGGGCACCACAGAGGAT). Relative mRNA
expression of Kidney_Wt1_minus exons 4 and 5 was
compared to the "known" event and reported as relative
mRNA fold abundance.
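
The text does not state how the relative fold abundance was computed. One common approach, shown here purely as an assumption, is the comparative Ct calculation, in which each replicate ratio is 2^(Ct_known - Ct_target) under the further assumption of roughly 100% amplification efficiency, and the ratios are then averaged across replicates. The Ct values in the example are invented for illustration.

```python
def relative_fold_abundance(ct_target, ct_known):
    """Mean of per-replicate 2**(Ct_known - Ct_target) ratios.

    Assumes paired replicates and about 100% PCR efficiency
    (one cycle difference corresponds to a two-fold difference).
    """
    if not ct_target or len(ct_target) != len(ct_known):
        raise ValueError("need equal, non-zero numbers of replicates")
    ratios = [2.0 ** (known - target)
              for target, known in zip(ct_target, ct_known)]
    return sum(ratios) / len(ratios)


# Invented quadruplicate Ct values: the target amplifies 2 cycles earlier.
print(relative_fold_abundance([24.0] * 4, [26.0] * 4))  # 4.0
```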

Ethics statement 

All animal work contributing to this manuscript was 
conducted according to all state, national and interna- 
tional guidelines. Animal ethics approval was provided 
by AEEC3 of The University of Queensland (Approval 
IMB/572/08/NIH (NF)). 

Additional material 



Additional file 1: RPKM for RefSeq loci. RPKM calculation and tag 
abundance of non-redundant RefSeq loci (RefSeq loci compiled by UCSC 
Genome Browser). 

Additional file 2: Overlapping antisense expression for Lhx1. UCSC
screenshot of Lhx1 (negative strand) and antisense expression (positive
strand). Previously unassigned Affymetrix probe 1439232_at aligned with
overlapping (head-to-head) antisense transcript 1500016L03Rik with
corresponding heatmap of kidney subcompartment expression.
Microarray compartments from top to bottom of heatmap: ureteric tip;
s-shaped body; proximal tubule; cortical, and medullary interstitium;
medullary, and cortical collecting duct; renal corpuscle; cap mesenchyme;
loop of Henle; renal vesicle.

Additional file 3: Extended 3' UTR signal. Transcripts with tags beyond 
annotated 3'UTR within a 20 kb window. 

Additional file 4: Alternative Exon Junctions (> 5 tags). Loci with 
alternative splicing supported by a minimum of 5 exon junction tags. 

Additional file 5: Ret isoforms. UCSC screen shot of Ret locus. RefSeq 
gene model representation of Ret isoforms Ret51 (top) and Ret51 
(bottom). Difference within the C-terminal end of gene is captured by 
RNA-Seq exon junction tags and signal. Microarray compartments from 
top to bottom of heatmap: ureteric tip; s-shaped body; proximal tubule; 
cortical, and medullary interstitium; medullary, and cortical collecting 
duct; renal corpuscle; cap mesenchyme; loop of Henle; renal vesicle. 

Additional file 6: mRNA expression level measured by qRT-PCR for
Wt1 splice events. Kidney_Wt1_ctrl minus exon 5 (known) represents a
previously well characterized Wt1 splice event where exon 5 has been
spliced out. Kidney_Wt1_minus exons 4 and 5 (Ensembl transcript:
ENSMUST00000111100) represents an uncharacterized splice event where
exons 4 and 5 are spliced out. The expression ratios were averaged from
quadruplicate runs. Kidney_Wt1_minus exons 4 and 5 was compared
against the "known" splice event, which shows that the minus exons 4-5
event is expressed at a higher level than the "known" event.

Additional file 7: Alternative Exon Junctions (> 5 tags). Exon junction 
tags with reference (UCSC Genes) or non-reference (other models) 
evidence of alternative 5' end usage. 

Additional file 8: Alt. 5 prime junction tags. Loci with alternative 
splicing supported by a minimum of 5 exon junction tags. 

Additional file 9: Embryonic kidney microRNAs. Tag abundance of 
mature miRNAs based on mapping to hairpins (mirBase version 15). 

Additional file 10: Kidney Primary miRNA (pri-miRNA) transcripts. 

Intergenic pri-miRNA annotations from [5]]with corresponding miRNAs 
expressed in 15.5 dpc kidney. Affymetrix probeset ID's representing pri- 
miRNA are also provided. 

Additional file 11: Sense and Antisense transcripts. Antisense 
transcripts overlapping RefSeq transcripts detected in developing mouse 
kidneys. 